Group 4 - CV 2 May 22 A

• DOMAIN: Automotive Surveillance.

• CONTEXT: Computer vision can be used to automate supervision and generate an appropriate action trigger if an event is detected in the image of interest. For example, a car moving on the road can be easily identified by a camera by its make, type, colour, number plate, etc.

• DATA DESCRIPTION: The Cars dataset contains 16,185 images of 196 classes of cars. The data is split into 8,144 training images and 8,041 testing images, where each class has been split roughly in a 50-50 split. Classes are typically at the level of Make, Model, Year, e.g. 2012 Tesla Model S or 2012 BMW M3 coupe.

‣ Train Images: Consists of real images of cars as per the make and year of the car.

‣ Test Images: Consists of real images of cars as per the make and year of the car.

‣ Train Annotation: Consists of bounding box region for training images.

‣ Test Annotation: Consists of bounding box region for testing images. The dataset has been attached along with this project; please use it for this capstone project.

Original link to the dataset, for your reference only: https://www.kaggle.com/jutrera/stanford-car-dataset-by-classes-folder

Reference: Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei. "3D Object Representations for Fine-Grained Categorization." 4th IEEE Workshop on 3D Representation and Recognition (3dRR-13), at ICCV 2013, Sydney, Australia, Dec. 8, 2013.

• PROJECT OBJECTIVE: Design a DL-based car identification model.

• PROJECT TASK: [ Score: 100 points]

1. Milestone 1: [ Score: 40 points]

Input: Context and Dataset

Process:

Step 1: Import the data. [ 3 points ]

Let us mount Google Drive and get the path

Assign the image folder path

Extract the files from the zip folder

Get the total number of training and testing classes (folders indicating the type of car) and print the total

Inference: We have downloaded the files and created the train and test folders. Clearly, there are 196 different classes in both the train and test datasets.
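The extraction and class-counting step above can be sketched as follows. This is a minimal sketch: the zip path and folder layout (one sub-folder per car class inside each split) are assumptions based on the dataset description, and on Colab the drive would first be mounted with `from google.colab import drive; drive.mount('/content/drive')`.

```python
import os
import zipfile

def extract_dataset(zip_path, dest_dir):
    """Extract the dataset archive into dest_dir."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)

def count_classes(split_dir):
    """Count the class folders (one folder per car class) in a train/test split."""
    return sum(
        1 for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name))
    )
```

With the Stanford Cars folder structure, `count_classes` should report 196 for both splits.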

Step 2: Map training and testing images to its classes. [ 6 points ]

Let us now see how many images there are in total in both the test and train folders

Let us use the annotation data and assign classes to the correct images based on image name mapping

Inference: As we can see, there are 8,144 training and 8,041 test images.
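One way to build the image-to-class mapping, assuming the class-per-folder layout described above (the image extensions checked are an assumption):

```python
import os

def map_images_to_classes(split_dir):
    """Walk a split folder and map each image file name to its class folder name."""
    image_to_class = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if not os.path.isdir(class_dir):
            continue
        for fname in os.listdir(class_dir):
            if fname.lower().endswith((".jpg", ".jpeg", ".png")):
                image_to_class[fname] = class_name
    return image_to_class
```

The length of the returned dictionary gives the total image count per split.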

Step 3: Map training and testing images to its annotations. [ 6 points ]

Inference: We have mapped the image names to their bounding boxes.
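The annotation mapping can be sketched with the standard library; the column order (filename, x1, y1, x2, y2, class id) is an assumption about the annotation CSVs shipped with the dataset:

```python
import csv

def load_annotations(csv_path):
    """Read an annotation CSV (assumed columns: filename, x1, y1, x2, y2, class_id)
    and map each image name to its bounding box and class."""
    boxes = {}
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            fname, x1, y1, x2, y2, cls = row[:6]
            boxes[fname] = {"bbox": (int(x1), int(y1), int(x2), int(y2)),
                            "class_id": int(cls)}
    return boxes
```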

Step 4: Display images with bounding box. [ 5 points ]

Inference: Clearly, the bounding boxes are correctly mapped to the image names, and the image names are also printed as labels.
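Drawing a box and label on an image can be done with Pillow (a sketch; the notebook may equally use matplotlib patches, and the helper name is ours):

```python
from PIL import Image, ImageDraw

def draw_bbox(image, bbox, label=None, color="red"):
    """Return a copy of `image` with the (x1, y1, x2, y2) bounding box drawn on it."""
    annotated = image.copy()
    draw = ImageDraw.Draw(annotated)
    draw.rectangle(bbox, outline=color, width=3)
    if label:
        # Print the class / image name just above the box as a label.
        draw.text((bbox[0], max(bbox[1] - 12, 0)), label, fill=color)
    return annotated
```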

EDA

Before building a model, let us do some exploratory data analysis on the given dataset

Let us create a dataframe of the images so we can perform some EDA easily; we shall do this for the TEST data first

Let us extract the first word, which indicates the company name, from the Car Class column

Let us group the dataframe by Car Company and get the total per company

Inference: We can see that there are about 49 different companies that make cars. On average, each company has 166 cars, with 905 being the maximum and one company having only about 29 cars. More than 75% of these companies have at most 171 cars. There is some class imbalance, since one company has as many as 905 cars while another has just 29, but for this project we shall not balance the data.
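The company extraction and grouping can be sketched with pandas; the class names below are illustrative rows, and the assumption (stated above) is that the first word of the class name is the company:

```python
import pandas as pd

# Hypothetical dataframe of class folder names, e.g. built from os.listdir(...)
df = pd.DataFrame({"Car Class": [
    "Audi S4 Sedan 2012", "Audi S4 Sedan 2007",
    "BMW M3 Coupe 2012", "Chevrolet Corvette 2012",
]})

# Company name is assumed to be the first word of the class name.
df["Car Company"] = df["Car Class"].str.split().str[0]

# Count entries per company; describe() yields the mean / min / max / quartiles
# discussed in the inference above, and head(10) the top-10 ranking.
counts = df.groupby("Car Company").size().sort_values(ascending=False)
print(counts.describe())
print(counts.head(10))
```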

Let us see the top 10 Car Companies from the given data

Inference: Clearly, Chevrolet is ranked 1 with more than 905 cars, and Bentley is at #10 with 238 cars. These 10 companies have an average of 218 cars, which is significantly higher than the overall average. This indicates some imbalance, so let us also look at the bottom 10 companies.

Inference: We can clearly see the last 10 companies, with Maybach being the smallest at 29 cars and Porsche in 10th position with 44 cars. There is no imbalance among these 10 companies; however, from an overall data perspective, there can be some imbalance because their counts are very small compared to the top 10 companies.

Conclusion: We have examined the top 10 and bottom 10 companies

There is some imbalance due to the significant contribution from the top 10 companies

We could balance the data by removing some of the smallest classes from the top companies

We could also ignore the last 3 companies, which have very few cars, to help balance the data

For this stage, we are not balancing the data but proceeding with it as-is

Let us plot a frequency distribution

Inference: The distribution seems okay, with a few outlier peaks and very few samples for some classes. As discussed above, we are not targeting data balancing in the current scope.

Before that, let us find the classes that differ only by year; to do so, let us remove the year part from the folder names and check whether we then have duplicate folders

Let us get the duplicates without dropping the entries

We can see that there are multiple classes that differ only by year, and hence, before we proceed, we shall check whether the images from different years are significantly similar
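Stripping the trailing year and grouping the class names can be sketched as below (the helper name and the assumption that the year is a trailing 4-digit token are ours):

```python
import re
from collections import defaultdict

def classes_differing_only_by_year(class_names):
    """Group class folder names after stripping a trailing 4-digit year;
    return only the groups that contain more than one year."""
    groups = defaultdict(list)
    for name in class_names:
        base = re.sub(r"\s+(19|20)\d{2}$", "", name)
        groups[base].append(name)
    return {base: names for base, names in groups.items() if len(names) > 1}
```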

For that, let us plot the Audi S4 Sedan, which has two classes: 2012 and 2007

Inference: Clearly, there is not much change in car design from the 2007 model to the 2012 model. We may combine these classes for better predictability.

Data preparation

As we can see from the above, we have 7 class pairs that differ only by year, so let us try to merge each pair into one folder

In order to get two rows at a time, we shall shift the path column and add another column to get the paths of both folders (e.g., Audi S4 Sedan 2012 and Audi S4 Sedan 2007 have different paths, but we need both of them for merging)
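The column-shift trick can be sketched as follows, assuming the duplicate classes have been sorted so that each pair sits on adjacent rows (the column names are illustrative):

```python
import pandas as pd

# Hypothetical dataframe of duplicate class pairs, sorted so each pair is adjacent.
dups = pd.DataFrame({"path": [
    "train/Audi S4 Sedan 2007", "train/Audi S4 Sedan 2012",
]})

# Shift the path column up by one so each row also carries its partner's path.
dups["merge_into"] = dups["path"].shift(-1)

# Keep every other row: one (source path, destination path) record per pair.
pairs = dups.iloc[::2]
```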

Let us merge both the folders

Let us verify the merged folders by checking the count of cars.

Audi S4 Sedan 2007 had 40 images and Audi S4 Sedan 2012 had 45, so our merged folder should have 85 images and the other folder should have none

Inference: We have now merged the images into the first folder; since the second folder has no cars, it will be ignored
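The folder merge itself can be sketched with the standard library (the function name is ours; which folder is kept as the destination is a choice):

```python
import os
import shutil

def merge_class_folders(src_dir, dest_dir):
    """Move every file from src_dir into dest_dir; return both resulting counts."""
    for fname in os.listdir(src_dir):
        shutil.move(os.path.join(src_dir, fname), os.path.join(dest_dir, fname))
    return len(os.listdir(dest_dir)), len(os.listdir(src_dir))
```

For the Audi S4 Sedan pair above, this should return (85, 0).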

Let us now perform the same steps on the test classes so that we do not create imbalance between categories

Note that the same car classes differ only by year in the test data as well, and the total number is again 7

Inference: We have merged the test folders that differ only by year

Let us now map the annotations to the images from the new folder structure

Let us also plot some random images from both the test and train folders

Step 5: Design, train and test basic CNN models to classify the car. [ 10 points ]

Let us build a convolutional neural network with just a few layers and a 196-way softmax output

Let us plot the curves

Inference: With 10 epochs and just a few layers, our basic CNN model has a test accuracy of 2.2%, so this approach is clearly not yielding useful results. We need to use the annotations effectively to mask the region of interest and use a different model for better accuracy.
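An architecture of the kind described (a few conv blocks plus a softmax head) can be sketched in tf.keras; the exact filter counts, input size, and dense width here are illustrative assumptions, not the notebook's actual configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_basic_cnn(num_classes=196, input_shape=(224, 224, 3)):
    """A deliberately small CNN: three conv/pool blocks plus a softmax head."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

A model this small has too little capacity and context for 196 fine-grained classes, which is consistent with the near-chance test accuracy reported above.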

Step 6: Interim report [ 10 points ]
Submission: Interim report, Jupyter Notebook with all the steps in Milestone-1

Interim Report

Introduction:

The goal of this project is to design a deep learning-based car identification model using computer vision techniques. We will be using the Cars dataset, which contains 16,185 images of 196 classes of cars. The dataset is split into 8,144 training images and 8,041 testing images, where each class has been split roughly in a 50-50 split. Classes are typically at the level of Make, Model, Year, e.g. 2012 Tesla Model S or 2012 BMW M3 coupe.

Milestone 1:

For milestone 1, we have completed the following steps:

Step 1: Import the data

We have imported the Cars dataset, which contains the following files:

Train Images: Consists of real images of cars as per the make and year of the car.

Test Images: Consists of real images of cars as per the make and year of the car.

Train Annotation: Consists of bounding box region for training images.

Test Annotation: Consists of bounding box region for testing images.

Step 2: Map training and testing images to their classes

We have mapped each training and testing image to its respective class. This was done by reading the annotation CSV files and creating a dictionary that maps the image name to its class.

Step 3: Map training and testing images to their annotations

We have mapped each training and testing image to its respective annotation, which consists of the bounding box coordinates and image class. This was done by reading the annotation CSV files and creating a dictionary that maps the image name to its bounding box and class.

Step 4: Display images with bounding box

We have written a script to display random images from the training dataset with their bounding boxes. This was done to get a better understanding of the dataset and to ensure that the annotations are correct.

Step 5: Design, train and test basic CNN models to classify the car

We have designed a basic CNN model to classify the car images. The model consists of 3 convolutional layers, followed by 1 fully connected layer and a softmax output layer. We trained the model for 10 epochs and achieved an accuracy of 98.9% on the train set, but the test accuracy is very low, around 2.3% only, indicating severe overfitting.

Step 6: Interim report

We are currently working on the interim report, which includes a detailed description of the project, the dataset, and the progress made so far.

Conclusion:

We have made good progress in milestone 1, completing all the required steps. We have imported the dataset, mapped the images to their classes and annotations, displayed images with their bounding boxes, and designed a basic CNN model for car classification. In the next milestone, we will fine-tune the basic CNN model and design an RCNN-based object detection model.

2. Milestone 2: [ Score: 60 points]

Input: Preprocessed output from Milestone-1

Process:

Step 1: Fine tune the trained basic CNN models to classify the car. [ 5 points ]
Step 2: Design, train and test RCNN & its hybrids based object detection models to impose the bounding box or mask over the area of interest. [ 10 points ]
Step 3: Pickle the model for future prediction [ 5 Points]
Step 4: Final Report [40 Points]
Submission: Final report, Jupyter Notebook with all the steps in Milestone-1 and Milestone-2
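Step 3 of Milestone 2 (pickling the model for future prediction) can be sketched as below. Note this is a generic pickle round-trip; for Keras models specifically, `model.save(...)` / `tf.keras.models.load_model(...)` is usually the more robust choice.

```python
import pickle

def save_model(model, path):
    """Serialise a trained model object to disk for future prediction."""
    with open(path, "wb") as f:
        pickle.dump(model, f)

def load_model(path):
    """Load a previously pickled model back into memory."""
    with open(path, "rb") as f:
        return pickle.load(f)
```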

3. Milestone 3: [ Optional ]

Process:

Step 1: Design a clickable UI-based interface which allows the user to browse and input an image, and outputs the class and the bounding box or mask [ highlight area of interest ] of the input image
Submission: Final report, Jupyter Notebook with the addition of the clickable UI-based interface